Categorizing Concepts for Detecting Drifts in Stream

Authors

  • Sharanjit Kaur
  • Vasudha Bhatnagar
  • Sameep Mehta
  • Sudhir Kapoor
Abstract

Mining evolving data streams for concept drifts has gained importance in applications such as customer behavior analysis, network intrusion detection, and credit card fraud detection. Several approaches have been proposed for detecting concept drifts in the context of supervised learning in data streams. Recently, researchers have been looking into the problem of identifying concept drifts in unlabeled data streams. Prevalent approaches study the evolution of streaming clusters using intrinsic and extrinsic characteristics of the discovered clusters, where each cluster is considered a concept. In this paper we model an unlabeled, uniform data stream as a stochastic Poisson process and study the arrival pattern of data points to analyse the nature of an evolving concept (cluster). Each concept is modeled as a stochastic Poisson process and is individually observed for the arrival rates of the incoming data points. A random sample of arrival rates is collected for each concept, and appropriate non-parametric tests are applied to infer the nature of evolution for the concept. Concept drift in the stream can then be inferred from the overall behavior of the concepts. We also propose a taxonomy of various types of concept behaviors and the interrelations among them. Experiments have been performed to demonstrate the feasibility, validity and scalability of the proposed method.
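
The idea of observing per-concept arrival rates and testing them non-parametrically can be illustrated with a minimal Python sketch. The abstract does not name the tests used, so the Mann-Kendall trend test below is a stand-in chosen for illustration, not the authors' method. The function names, the fixed-length windows, and the labels "stable", "growing" and "decaying" are likewise hypothetical and do not reproduce the paper's taxonomy.

```python
import math
import random


def simulate_arrivals(rate_fn, horizon, rng):
    """Simulate arrival times of one concept (cluster) on [0, horizon).

    Assumption: the rate is constant within each unit interval k and equals
    rate_fn(k) > 0, so inter-arrival times inside the interval are exponential.
    """
    times = []
    for k in range(horizon):
        lam = rate_fn(k)
        t = k + rng.expovariate(lam)
        while t < k + 1:
            times.append(t)
            t += rng.expovariate(lam)
    return times


def arrival_rates(timestamps, window, horizon):
    """Count arrivals per fixed-length window and convert counts to rates."""
    n_windows = math.ceil(horizon / window)
    counts = [0] * n_windows
    for t in timestamps:
        counts[min(int(t // window), n_windows - 1)] += 1
    return [c / window for c in counts]


def mann_kendall(xs):
    """Mann-Kendall trend test with the normal approximation.

    Returns (S, z, two-sided p-value). No tie correction is applied, which
    slightly overstates the variance when rates repeat (a conservative choice).
    """
    n = len(xs)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (xs[j] > xs[i]) - (xs[j] < xs[i])
    var = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var)
    elif s < 0:
        z = (s + 1) / math.sqrt(var)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


def classify_concept(rates, alpha=0.05):
    """Map the test outcome to a hypothetical behavior label."""
    _, z, p = mann_kendall(rates)
    if p >= alpha:
        return "stable"
    return "growing" if z > 0 else "decaying"


if __name__ == "__main__":
    rng = random.Random(0)
    horizon = 40
    concepts = {
        "A (constant rate)": lambda k: 20.0,
        "B (rising rate)": lambda k: 5.0 + 2.0 * k,
        "C (falling rate)": lambda k: 85.0 - 2.0 * k,
    }
    for name, rate_fn in concepts.items():
        times = simulate_arrivals(rate_fn, horizon, rng)
        rates = arrival_rates(times, 1.0, horizon)
        print(name, "->", classify_concept(rates))
```

In this sketch each concept is tested independently, and a drift in the stream would be inferred by looking at the labels of all concepts together, for example when several concepts stop being "stable" at once. A rank-based test is used because it makes no assumption about the distribution of the sampled rates.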

Similar resources

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

Data stream is a sequence of data generated from various information sources at a high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, commonly the stream is divided into fixed size windows or gradual forgetting is used. Concept drift refer...

Towards incremental learning of nonstationary imbalanced data stream: a multiple selectively recursive approach

Difficulties of learning from nonstationary data stream are generally twofold. First, dynamically structured learning framework is required to catch up with the evolution of unstable class concepts, i.e., concept drifts. Second, imbalanced class distribution over data stream demands a mechanism to intensify the underrepresented class concepts for improved overall performance. To alleviate the c...

Reservoir of Diverse Adaptive Learners and Stacking Fast Hoeffding Drift Detection Methods for Evolving Data Streams

The last decade has seen a surge of interest in adaptive learning algorithms for data stream classification, with applications ranging from predicting ozone level peaks, learning stock market indicators, to detecting computer security violations. In addition, a number of methods have been developed to detect concept drifts in these streams. Consider a scenario where we have a number of classifi...

Concept Drift Detection for Imbalanced Stream Data

Common statistical prediction models often require and assume stationarity in the data. However, in many practical applications, changes in the relationship of the response and predictor variables are regularly observed over time, resulting in the deterioration of the predictive performance of these models. This paper presents Linear Four Rates (LFR), a framework for detecting these concept dri...

Significance of the Metaphor of the Lion in Categorizing Mystic Concepts in Mulavi (A Study of Existing Examples in Ghazals from Divan Kabir)

Sufi and mystic poets employ linguistic evidence, especially expressions regarding animals, to represent mystic concepts. In this study, to explore and clarify the meanings Mulana intended to convey as the field of destination, we will examine the linguistic expression “the Lion” as the field of origination in ghazals of Divan Kabir by using the conceptual theories of metaphor introduced by Geo...

Publication date: 2009